Architecture for supporting Hardware Collectives in Output-Queued High-Radix Routers
Authors
Abstract
Collective communication performance is critical for many applications. In this paper, we present an architecture that efficiently supports collective operations, such as multicasts and reductions, in the switches of parallel computer interconnects. The design is an output-queued switch architecture with cross-point buffering. Output-queued architectures have been less popular in the past because they require more internal speedup and buffering; with current technology, however, it is straightforward to build output-queued switches. We demonstrate that output-queued architectures make multicasts and reductions fairly easy to implement efficiently, and we show that our schemes scale to a large number of switch ports. We present the performance of multicasts and reductions on individual switches and on networks of switches, assuming a fat-tree topology for the networks. We also present simulation results based on synthetic workloads that emulate a molecular dynamics application.
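To make the idea in the abstract concrete, here is a minimal Python sketch of how a cross-point-buffered, output-queued switch might handle a multicast and a reduction. It is not taken from the paper: the class `CrosspointSwitch`, its methods, the round-robin output arbitration, and the rule that a reduction fires only once every source has a buffered contribution are all illustrative assumptions.

```python
from collections import deque


class CrosspointSwitch:
    """Toy output-queued switch with one FIFO per (input, output) cross-point.

    Hypothetical model for illustration only; not the paper's design.
    """

    def __init__(self, ports):
        self.ports = ports
        # xpoint[i][j] buffers packets from input i headed to output j.
        self.xpoint = [[deque() for _ in range(ports)] for _ in range(ports)]
        # One round-robin pointer per output port (assumed arbitration policy).
        self.rr = [0] * ports

    def multicast(self, src, dests, payload):
        """Replicate a packet by writing one copy into each destination column."""
        for d in dests:
            self.xpoint[src][d].append(payload)

    def drain(self, out):
        """Send one packet from output `out`, chosen round-robin over inputs."""
        for k in range(self.ports):
            i = (self.rr[out] + k) % self.ports
            if self.xpoint[i][out]:
                self.rr[out] = (i + 1) % self.ports
                return self.xpoint[i][out].popleft()
        return None  # nothing buffered for this output

    def reduce(self, out, sources, op):
        """Combine the head packets from `sources` (non-empty) at output `out`.

        Returns None until every source has a contribution buffered.
        """
        column = [self.xpoint[s][out] for s in sources]
        if not all(column):  # an empty deque is falsy
            return None
        result = column[0].popleft()
        for q in column[1:]:
            result = op(result, q.popleft())
        return result
```

In this sketch a multicast needs no extra arbitration at the input, because each destination output has its own buffer for that input, and a reduction can be evaluated entirely at the output port by reading down one column of cross-point buffers.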
Similar resources
1 Architectures of Internet Switches and Routers
Over the years, different architectures have been investigated for the design and implementation of high-performance switches. Particular architectures were determined by a number of factors based on performance, flexibility and available technology. Design differences were mainly a variation in the queuing functions and the switch core. The crossbar-based architecture is perhaps the dominant a...
Input-queued router architectures exploiting cell-based switching fabrics
Input queued and combined input/output-queued architectures have recently come to play a major role in the design of high-performance switches and routers for packet networks. These architectures must be controlled by a packet scheduling algorithm, which solves contentions in the transfer of data units to switch outputs. Several scheduling algorithms were proposed in the literature for switches...
A Practical Scheduling Algorithm to Achieve 100% Throughput in Input-Queued Switches
Input queueing is becoming increasingly used for high-bandwidth switches and routers. In previous work, it was proved that it is possible to achieve 100% throughput for input-queued switches using a combination of virtual output queueing and a scheduling algorithm called LQF. However, this is only a theoretical result: LQF is too complex to implement in hardware. In this paper we introduce a ne...
Buffer Sizing in a Combined Input Output Queued (CIOQ) Switch
In all internet routers buffers are needed to hold packets during times of congestion. In some recent work, the question of finding the minimum buffer size guaranteeing high throughput has been addressed [3] [6]. The answer to this question is particularly important in building all-optical routers, where the optical technology allows buffering up to a few dozen packets [7]. While in practice mo...
A Practical Scheduler For High-Speed Packet Switches and Internet Routers
The input queued (IQ) crossbar based switching, employing virtual output queueing (VOQ), is the dominant architecture for high-performance packet switches. The performance of a VOQ switch depends solely on the scheduling algorithm used. Maximum Weight Matching (MWM) algorithms have optimal performance however they are not practical due to their hardware complexity. Round Robin (RR) based algori...
Journal title:
Volume, issue
Pages -
Publication date 2005